Comparison of Evaluation Metrics for Sentence Boundary Detection
Abstract
Automatic detection of sentences in speech is useful to enrich speech recognition output and to ease subsequent language processing. In the recent NIST evaluations for this task, an error rate was used to evaluate system performance. A variety of other metrics, such as F-measure and ROC or DET curves, have also been explored in other studies. This paper takes a closer look at the evaluation issue for sentence boundary detection. We employ several metrics (NIST error rate, classification error rate per word boundary, precision and recall, ROC curve, DET curve, precision-recall curve, and the area under the curves) to compare the output of different systems. In addition, we use two different corpora to evaluate the impact of different degrees of class imbalance in the data. We show that it is helpful to use curves as well as a single performance metric, and that different curves offer different advantages for visualization. Furthermore, data skewness also has an impact on the metrics.
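The single-value metrics named in the abstract can be illustrated with a minimal sketch. It assumes that the reference and the system output are aligned sequences with one 0/1 label per word boundary, and that the NIST error rate is the number of inserted plus deleted boundaries divided by the number of reference boundaries. The function name `boundary_metrics` and the toy labels are hypothetical and do not come from the paper.

```python
def boundary_metrics(reference, hypothesis):
    """Score sentence boundary decisions.

    Both arguments are equal-length sequences with one label per word
    boundary: 1 = sentence boundary, 0 = no boundary.
    (Assumed input format, chosen for illustration only.)
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")
    if not reference:
        raise ValueError("at least one word boundary is required")

    pairs = list(zip(reference, hypothesis))
    hits = sum(1 for r, h in pairs if r == 1 and h == 1)
    insertions = sum(1 for r, h in pairs if r == 0 and h == 1)  # false alarms
    deletions = sum(1 for r, h in pairs if r == 1 and h == 0)   # misses

    n_reference = hits + deletions   # boundaries in the reference
    n_detected = hits + insertions   # boundaries proposed by the system
    errors = insertions + deletions

    precision = hits / n_detected if n_detected else 0.0
    recall = hits / n_reference if n_reference else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)

    return {
        # Errors normalized by reference boundaries (assumed NIST-style definition).
        "nist_error_rate": errors / n_reference if n_reference else 0.0,
        # Errors normalized by all word boundaries.
        "classification_error_rate": errors / len(pairs),
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    # Toy data: 10 word boundaries, 3 of them true sentence boundaries.
    ref = [0, 0, 1, 0, 0, 0, 1, 0, 0, 1]
    hyp = [0, 1, 1, 0, 0, 0, 0, 0, 0, 1]
    for name, value in boundary_metrics(ref, hyp).items():
        print(f"{name}: {value:.3f}")
```

On this toy input there is one insertion and one deletion, so the NIST-style error rate is 2/3 while the per-boundary classification error rate is only 2/10. The gap comes entirely from the choice of denominator, which is one way the class imbalance mentioned in the abstract can affect the reported numbers.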
Similar resources
There's No Comparison: Reference-less Evaluation Metrics in Grammatical Error Correction
Current methods for automatically evaluating grammatical error correction (GEC) systems rely on gold-standard references. However, these methods suffer from penalizing grammatical edits that are correct but not in the gold standard. We show that reference-less grammaticality metrics correlate very strongly with human judgments and are competitive with the leading reference-based evaluation metr...
Document Parsing: Towards Realistic Syntactic Analysis
In this work we take a view of syntactic analysis as processing ‘raw’, running text instead of idealised, pre-segmented inputs—a task we dub document parsing. We observe the state of the art in sentence boundary detection and tokenisation, and their effects on syntactic parsing (for English), observing that common evaluation metrics are ill-suited for the comparison of an ‘end-to-end’ syntactic...
Semantic Role Labeling of Persian Sentences Using a Memory-Based Learning Approach
Extracting semantic roles is one of the major steps in representing text meaning. It refers to finding the semantic relations between a predicate and syntactic constituents in a sentence. In this paper we present a semantic role labeling system for Persian, using a memory-based learning model and standard features. Our proposed system implements a two-phase architecture to first identify...
Edge Detection Based On Nearest Neighbor Linear Cellular Automata Rules and Fuzzy Rule Based System
Edge detection is an important task for sharpening the boundary of images to detect the region of interest. This paper applies linear cellular automata rules and a Mamdani fuzzy inference model for edge detection in both monochromatic and RGB images. In the uniform cellular automata, a transition matrix has been developed for edge detection. The results have been compared to the ...